LINEAR FORMS AND QUADRATIC UNIFORMITY FOR FUNCTIONS ON $\mathbb{F}_p^n$



W.T. GOWERS AND J. WOLF 

Abstract. We give improved bounds for our theorem in [GW09a], which shows that a
system of linear forms on $\mathbb{F}_p^n$ with squares that are linearly independent has the
expected number of solutions in any linearly uniform subset of $\mathbb{F}_p^n$. While in [GW09a]
the dependence between the uniformity of the set and the resulting error in the average
over the linear system was of tower type, we now obtain a doubly exponential relation
between the two parameters.

Instead of the structure theorem for bounded functions due to Green and Tao [GrT08a],
we use the Hahn-Banach theorem to decompose the function into a quadratically
structured plus a quadratically uniform part. This new decomposition makes more efficient
use of the $U^3$ inverse theorem [GrT08a].
1. Introduction

In [GW09a] we asked which systems of linear equations have the property that one can
guarantee that any uniform subset of $\mathbb{F}_p^n$ contains the "expected" number of solutions. By
the "expected" number we mean the number of solutions one would expect in a random
subset of the same density, and by a "uniform" subset of $\mathbb{F}_p^n$ we mean a set $A$ of density
$\delta$ such that the function $f_A(x) = 1_A(x) - \delta$ has small $U^2$ norm.

There turns out to be a clean characterization of such systems: the number of solutions
can be controlled by the $U^2$ norm in the above sense whenever the system of linear forms
$L_1, \dots, L_m$ is square independent, by which we mean that the functions $L_1^2, \dots, L_m^2$ are
linearly independent over $\mathbb{F}_p$. The main result of [GW09a] was the following.

Theorem 1.1. Let $L_1, \dots, L_m$ be a square-independent system of linear forms in $d$
variables of Cauchy-Schwarz complexity at most 2. For every $\epsilon > 0$ there exists $c > 0$ such
that if $f : \mathbb{F}_p^n \to [-1, 1]$ is any function with $\|f\|_{U^2} \le c$, then

$$\left| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)) \right| \le \epsilon.$$

The statement about the number of solutions in a uniform set $A \subseteq \mathbb{F}_p^n$ of density $\alpha$ can
be recovered by setting $f$ equal to the "balanced function" $f_A$ defined above.

We encourage the reader to consult the introduction of [GW09a] for a detailed discussion
of the context of this result, and to ignore the additional assumption of Cauchy-Schwarz
complexity 2 in Theorem 1.1 for the moment. We shall not define the term here but simply
remark that it is a straightforward condition that allows us to say that the average under
consideration is stable under small perturbations in the $U^3$ norm.

In [GW09a] we defined a linear system $L_1, \dots, L_m$ to have true complexity $k$ if $k$ is the
least integer such that the $U^{k+1}$ norm controls the average

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)).$$

It is not difficult to see that for any square-dependent system one can construct a uniform
set $A \subseteq \mathbb{F}_p^n$ that contains significantly more than the expected number of solutions to the
linear system. The proof is a generalization of the well known example of a uniform set that
contains significantly more than the expected number of 4-term arithmetic progressions,
which is based on the identity

$$x^2 - 3(x + d)^2 + 3(x + 2d)^2 - (x + 3d)^2 = 0.$$
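The identity behind the 4-term progression example is a statement about square dependence: the squares of the four forms $x$, $x+d$, $x+2d$, $x+3d$ satisfy a linear relation. As a small sanity check, the following sketch verifies it by brute force modulo a few primes. The function name `ap4_identity_holds` is ours and purely illustrative.

```python
def ap4_identity_holds(p: int) -> bool:
    """Check x^2 - 3(x+d)^2 + 3(x+2d)^2 - (x+3d)^2 == 0 (mod p) for all x, d in F_p."""
    return all(
        (x * x - 3 * (x + d) ** 2 + 3 * (x + 2 * d) ** 2 - (x + 3 * d) ** 2) % p == 0
        for x in range(p)
        for d in range(p)
    )


def ap4_residual(x: int, d: int) -> int:
    """The left-hand side of the identity, evaluated over the integers."""
    return x * x - 3 * (x + d) ** 2 + 3 * (x + 2 * d) ** 2 - (x + 3 * d) ** 2
```

Since the relation already holds over the integers, it holds modulo every prime, which is why the four squares are linearly dependent over $\mathbb{F}_p$.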



Combining this fact with Theorem 1.1 we obtain the result that a square-independent
linear system has true complexity equal to 1.

We conjectured in [GW09a] that the true complexity was always equal to the least integer
$k$ such that the functions $L_1^{k+1}, \dots, L_m^{k+1}$ are linearly independent over $\mathbb{F}_p$.

This paper is the first in a series of three in which we expand on our result in [GW09a]. In
the second paper [GW09b] we resolve the above conjecture, in a qualitative sense at least,
for all (reasonable) systems of linear equations over $\mathbb{F}_p^n$, and in the third paper [GW09c]
we extend Theorem 1.1 to the technically more challenging setting of $\mathbb{Z}_N$.

In the present paper we derive a significant improvement over the bounds for Theorem 1.1
obtained in [GW09a], which were of tower type. Here we obtain the uniformity parameter
as a doubly exponential function of the error $\epsilon$.

Theorem 1.2. In Theorem 1.1 the uniformity parameter $c$ can be taken to be

$$\exp(-\exp(c_{m,p}\,\epsilon^{-(4C_p)m})),$$

where $c_{m,p}$ is a constant that depends on $m$ and $p$ only, and $C_p$ is a constant depending on
$p$ only and arising in the $U^3$ inverse theorem.

The quantitative improvement which Theorem 1.2 provides over Theorem 1.1 is based
on a new type of decomposition of a bounded function into quadratic phases. Instead of
the structure theorem for bounded functions due to Green and Tao [GrT08a], we use the
Hahn-Banach theorem to decompose the function into a quadratically structured plus a
quadratically uniform part.

This new decomposition makes more efficient use of the $U^3$ inverse theorem. Moreover,
it provides a model for our more difficult proof in the cyclic group $\mathbb{Z}_N$ [GW09c]. In this
respect, we are following a course that has been strongly advocated by Green [Gr05]. In
[GW09c] we also obtain a doubly exponential bound, by following the arguments in this
paper as closely as we can, but replacing subspaces with the technology of regular Bohr
sets that originated in the work of Bourgain.

Just before we submitted this paper, Green and Tao [GrT10] proved the $\mathbb{Z}_N$ case of the
full conjecture. Consequently the problem is now, at least in a qualitative sense, completely
solved in both settings. However, the main point of this paper is the strong bounds we
obtain (since we have already proved the result with a much worse bound).

2. A SIMPLE DECOMPOSITION INTO QUADRATIC PHASES 

As in our previous paper [GW09a], our starting point will be the following inverse
theorem of Green and Tao [GrT08a] (in the case $p > 2$) and Samorodnitsky [S07] (when
$p = 2$).

Theorem 2.1. Let $0 < \delta \le 1$ and let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function with $\|f\|_\infty \le 1$ and
$\|f\|_{U^3} \ge \delta$. Then there exists a quadratic form $q : \mathbb{F}_p^n \to \mathbb{F}_p$ such that

$$|\mathbb{E}_x f(x)\,\omega^{q(x)}| \ge \exp(-C_p \delta^{-C_p}).$$

Here, $C_p$ is a constant that depends on $p$ only.

Green and Tao use the above theorem to decompose an arbitrary function $f$ into two
parts $f_1$ and $f_2$, where $f_1$ is quadratically uniform and $f_2$ is quadratically structured, in
the sense that one can partition $\mathbb{F}_p^n$ into a small number of quadratic subvarieties on each
of which $f_2$ is constant. In this paper, we shall take a somewhat different approach, more
closely analogous to the way conventional Fourier analysis is used to prove Roth's theorem.
That is, we shall simply decompose $f$ into a sum of functions of the form $\omega^{q_i}$, where the
$q_i$ are quadratic forms, plus an error that we can afford to ignore, and then calculate directly
using this expansion of $f$.

A big difference between the expansion we shall obtain and the expansion of a function
into Fourier coefficients is that there does not seem to be a canonical way of doing it,
because there are far more than $p^n$ different functions of the form $\omega^q$. (In harmonic-
analysis terms, we are dealing with an "overdetermined" system.) This creates difficulties,
which Green and Tao dealt with by projecting onto "quadratic factors". Here we shall deal
with them by applying the Hahn-Banach theorem for finite-dimensional normed spaces.
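To get a feel for the "overdetermined" remark, the following sketch counts, by brute force, how many distinct functions $x \mapsto q(x)$ arise from homogeneous quadratic forms $q$ on $\mathbb{F}_p^n$ for tiny $p$ and $n$, and compares the count with $p^n$. Restricting to homogeneous forms with explicitly enumerated coefficients is our simplifying assumption, and `count_quadratic_forms` is an illustrative name, not notation from the paper.

```python
import itertools


def count_quadratic_forms(p: int, n: int) -> int:
    """Count distinct value tables of homogeneous quadratic forms on F_p^n.

    Distinct value tables of q give distinct phase functions omega^q, because
    omega^a, for a in F_p, are pairwise distinct.
    """
    points = list(itertools.product(range(p), repeat=n))
    # One coefficient for each monomial x_i * x_j with i <= j.
    monomials = [(i, j) for i in range(n) for j in range(i, n)]
    tables = set()
    for coeffs in itertools.product(range(p), repeat=len(monomials)):
        table = tuple(
            sum(c * x[i] * x[j] for c, (i, j) in zip(coeffs, monomials)) % p
            for x in points
        )
        tables.add(table)
    return len(tables)
```

For $p = 3$ and $n = 2$ this gives 27 quadratic phases against only $p^n = 9$ characters, and the gap widens quickly as $n$ grows.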

It turns out that the Hahn-Banach theorem is a very useful tool in additive combinatorics 
that can be used to prove a large variety of decomposition and approximate structure 
theorems. It also yields a simplified proof of Green and Tao's transference principle, which 
was a crucial ingredient in the proof that there exist arbitrarily long arithmetic progressions 
in the primes [GrT08b]. These results are discussed at length in [G08].

Before we can explain why the Hahn-Banach theorem is useful, we must state both it
and one or two other simple results about duality in normed spaces. Throughout this
section we shall refer to an inner product, which is just the standard inner product on $\mathbb{C}^n$
(or later $\mathbb{C}^{\mathbb{F}_p^n}$).

Theorem 2.2. Let $X = (\mathbb{C}^n, \|\cdot\|)$ be a normed space and let $x \in X$ be a vector with
$\|x\| \ge 1$. Then there is a vector $z$ such that $|\langle x, z\rangle| \ge 1$ and such that $|\langle y, z\rangle| \le 1$ whenever
$\|y\| \le 1$.

Apart from Theorem 2.2, a proof of which can be found in any standard text on functional
analysis, we have aimed to keep this paper completely self-contained. We start by recalling
some standard notions from the theory of normed spaces. The dual norm $\|\cdot\|^*$ of a norm
$\|\cdot\|$ on $\mathbb{C}^n$ is defined by the formula

$$\|z\|^* = \sup\{|\langle x, z\rangle| : \|x\| \le 1\}.$$

For technical reasons, we shall generalize this concept to the situation where the norm $\|\cdot\|$
is defined on a subspace $V$ of $\mathbb{C}^n$. Then the dual is a seminorm, given by the formula

$$\|z\|^* = \sup\{|\langle x, z\rangle| : x \in V, \|x\| \le 1\}.$$

The next lemma is a standard fact in Banach space theory. 

Lemma 2.3. Let $k$ be a positive integer, and for each $i$ between 1 and $k$ let $\|\cdot\|_i$ be a norm
defined on a subspace $V_i$ of $\mathbb{C}^n$. Suppose that $V_1 + \dots + V_k = \mathbb{C}^n$, and define a norm $\|\cdot\|$
on $\mathbb{C}^n$ by the formula

$$\|x\| = \inf\{\|x_1\|_1 + \dots + \|x_k\|_k : x_1 + \dots + x_k = x\}.$$

Then this formula does indeed define a norm, and its dual norm $\|\cdot\|^*$ is given by the formula

$$\|z\|^* = \max\{\|z\|_1^*, \dots, \|z\|_k^*\}.$$

Proof. It is a simple exercise to check that the expression does indeed define a norm.

Let us begin by supposing that $\|z\|_i^* \ge 1$ for some $i$. Then there exists $x \in V_i$ such that
$\|x\|_i \le 1$ and $|\langle x, z\rangle| \ge 1$. But then $\|x\| \le 1$ as well, from which it follows that $\|z\|^* \ge 1$.
Therefore, $\|z\|^*$ is at least the maximum of the $\|z\|_i^*$.

Now let us suppose that $\|z\|^* > 1$. This means that there exists $x$ such that $\|x\| \le 1$
and $|\langle x, z\rangle| \ge 1 + \epsilon$ for some $\epsilon > 0$. Let us choose $x_1, \dots, x_k$ such that $x_i \in V_i$ for each $i$,
$x_1 + \dots + x_k = x$, and $\|x_1\|_1 + \dots + \|x_k\|_k < 1 + \epsilon$. Then

$$\sum_i |\langle x_i, z\rangle| > \|x_1\|_1 + \dots + \|x_k\|_k,$$

so there must exist $i$ such that $|\langle x_i, z\rangle| > \|x_i\|_i$, from which it follows that $\|z\|_i^* > 1$. This
proves that $\|z\|^*$ is at most the maximum of the $\|z\|_i^*$. □

Corollary 2.4. Let $k$ be a positive integer and for each $i \le k$ let $\|\cdot\|_i$ be a norm defined
on a subspace $V_i$ of $\mathbb{C}^n$, and suppose that $V_1 + \dots + V_k = \mathbb{C}^n$. Let $\alpha_1, \dots, \alpha_k$ be positive
real numbers, and suppose that it is not possible to write the vector $x$ as a linear sum
$x_1 + \dots + x_k$ in such a way that $x_i \in V_i$ for each $i$ and $\alpha_1\|x_1\|_1 + \dots + \alpha_k\|x_k\|_k \le 1$. Then
there exists a vector $z \in \mathbb{C}^n$ such that $|\langle x, z\rangle| \ge 1$ and such that $\|z\|_i^* \le \alpha_i$ for every $i$, or
equivalently, $|\langle y, z\rangle| \le \alpha_i$ for every $i$ and every $y \in V_i$ with $\|y\|_i \le 1$.

Proof. Let us define a norm $\|\cdot\|$ by the formula

$$\|x\| = \inf\{\alpha_1\|x_1\|_1 + \dots + \alpha_k\|x_k\|_k : x_1 + \dots + x_k = x\}.$$

Then our hypothesis is that $\|x\| \ge 1$. Therefore, by Theorem 2.2 there is a vector $z$ such
that $|\langle x, z\rangle| \ge 1$ and $|\langle y, z\rangle| \le 1$ whenever $\|y\| \le 1$.

The second condition tells us that $\|z\|^* \le 1$, and Lemma 2.3, applied to the norms
$\alpha_i\|\cdot\|_i$, tells us that $\|z\|^*$ is the maximum of the numbers $\alpha_i^{-1}\|z\|_i^*$. Therefore, $\|z\|_i^* \le \alpha_i$
for every $i$, as stated. □

Recall that the difficulty we are trying to deal with is that there is no (known) canonical
way of decomposing a function into functions of the form $\omega^q$. Corollary 2.4 is extremely
helpful for proving the existence of decompositions under these circumstances. Instead of
trying to find a decomposition explicitly, one assumes that there is no such decomposition
and uses Corollary 2.4 to derive a contradiction. The next result illustrates the technique.

Theorem 2.5. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_2 \le 1$. Then for every $\delta > 0$
and $\eta > 0$ there exists $M$ such that $f$ has a decomposition of the form

$$f(x) = \sum_i \lambda_i \omega^{q_i(x)} + g(x) + h(x),$$

where the $q_i$ are quadratic forms on $\mathbb{F}_p^n$ and

$$\eta^{-1}\|g\|_1 + \delta^{-1}\|h\|_{U^3} + M^{-1}\sum_i |\lambda_i| \le 1.$$

In fact, $M$ can be taken to be $\exp(C_p(\eta\delta)^{-C_p})$, where $C_p$ is the constant in Theorem 2.1.

Proof. Suppose not. Then for every quadratic form $q$ on $\mathbb{F}_p^n$ let $V(q)$ be the one-dimensional
subspace of $\mathbb{C}^{\mathbb{F}_p^n}$ generated by the function $\omega^q$, with the obvious norm: the norm of $\lambda\omega^q$
is $|\lambda|$.

Applying Corollary 2.4 to these norms and subspaces and also to the $L_1$ norm and $U^3$
norm defined on all of $\mathbb{C}^{\mathbb{F}_p^n}$, we deduce that there is a function $\phi : \mathbb{F}_p^n \to \mathbb{C}$ such that
$|\langle f, \phi\rangle| \ge 1$, $\|\phi\|_\infty \le \eta^{-1}$, $\|\phi\|_{U^3}^* \le \delta^{-1}$ and $|\langle \phi, \omega^q\rangle| \le M^{-1}$ for every quadratic form $q$.

Now the fact that $|\langle f, \phi\rangle| \ge 1$ implies, by Cauchy-Schwarz, that $\|\phi\|_2 \ge 1$. But we also
know that $\langle \phi, \phi\rangle \le \|\phi\|_{U^3}\|\phi\|_{U^3}^*$, so $\|\phi\|_{U^3} \ge \delta$. Applying the inverse theorem to $\eta\phi$, we
find that there is a quadratic form $q$ such that $|\langle \phi, \omega^q\rangle| \ge \exp(-C_p(\eta\delta)^{-C_p})$, contradicting
the fact that it has to be at most $M^{-1}$. □

Just before we continue, let us briefly discuss a more obvious approach to a slight variant
of Theorem 2.5 and see why it does not work. Theorem 2.1 tells us that every bounded
function $f$ with large $U^3$ norm correlates well with some function of the form $\omega^q$. So one
might assume that $f$ is bounded and try a simple inductive argument along the following
lines. If $\|f\|_{U^3}$ is large, then Theorem 2.1 gives us a quadratic form $q_1$ such that $f$ correlates
with $\omega^{q_1}$. So choose $\lambda_1$ such that $\|f - \lambda_1\omega^{q_1}\|_2$ is minimized, and let $f_1 = f - \lambda_1\omega^{q_1}$. Because
of the correlation, $\|f_1\|_2$ is substantially less than $\|f\|_2$. Now repeat for $f_1$, and keep going
until you reach some $k$ for which $\|f_{k+1}\|_{U^3}$ is small.

The problem with this argument is that we gradually lose control of the boundedness
of $f$. As we continually subtract the functions $\lambda_i\omega^{q_i}$ the $L_2$ norm goes down, but the
$L_\infty$ norm can go up. And $L_2$ control is not enough for Theorem 2.1, as the example of
a suitably normalized arithmetic progression shows. It turns out that a variant of the
inductive argument outlined above can be made to work if one uses a weaker assumption
than boundedness [C07] (which means that the result proved is stronger). Green and Tao's
approach to quadratic Fourier analysis assumes an $L_\infty$ bound for $f$ and uses averaging
projections, which decrease both the $L_2$ and $L_\infty$ norms. Thus, there seems to be a genuine
difference between Theorem 2.5 and their approach.

However, there are two aspects of Theorem 2.5 that place considerable limits on how
useful it is. The first is that M is rather large, so that bounds that depend on the theorem 
tend to be rather large as well. The second, which is more serious, is that there is no useful 
bound on the number of quadratic phase functions used to decompose /. We shall deal 
with these two problems in turn. 

3. Introducing quadratic averages 

In order to reduce $M$, we shall use a slightly stronger form of Theorem 2.1, which Green
and Tao [GrT08a] mention but do not need, and therefore do not formally state. To begin
with, here is a variant that they do state. If $V$ is a subspace of $\mathbb{F}_p^n$ and $y \in \mathbb{F}_p^n$, then they
define a seminorm $\|\cdot\|_{u^3(y+V)}$ on functions from $\mathbb{F}_p^n$ to $\mathbb{C}$ by setting

$$\|f\|_{u^3(y+V)} = \sup_q |\mathbb{E}_{x \in y+V} f(x)\,\omega^{-q(x)}|,$$

where the supremum is taken over all quadratic forms $q$ on $y + V$.

Theorem 3.1. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_\infty \le 1$ and $\|f\|_{U^3} \ge \delta$. Then
there exists a subspace $V$ of $\mathbb{F}_p^n$ of codimension at most $(2/\delta)^{C_p}$, where $C_p$ is a constant
that depends only on $p$, with the property that

$$\mathbb{E}_y \|f\|_{u^3(y+V)} \ge (\delta/2)^{C_p}.$$

One can deduce Theorem 2.1 very simply from this version: by an averaging argument,
there must exist $y$ such that $f$ correlates well on $y + V$ with some quadratic phase function
$\omega^q$; this function can be extended to a function on the whole of $\mathbb{F}_p^n$ in many different ways,
and a further averaging argument yields Theorem 2.1.

It turns out that, as Green and Tao remark, a slightly more precise theorem holds. The
result as stated tells us that for each $y$ we can find a local quadratic phase function $\omega^{q_y}$
defined on $y + V$ such that the average of $|\mathbb{E}_{x \in y+V} f(x)\,\omega^{-q_y(x)}|$ is at least $(\delta/2)^{C_p}$. However,
it is actually possible to do this in such a way that the "quadratic parts" of the quadratic
phase functions $q_y$ are the same. More precisely, it can be done in such a way that each
$q_y(x)$ has the form $q(x - y) + \phi_y(x - y)$ for some quadratic function $q : V \to \mathbb{F}_p$ (that is
independent of $y$) and some linear functionals $\phi_y : V \to \mathbb{F}_p$. A similar statement can be
read out of [S07] for the case of $\mathbb{F}_2^n$ (see also [W09]).

This will be convenient to us later, so let us make a definition so that we can refer to 
this property concisely. 

Definition 3.2. Let $V$ be a subspace of $\mathbb{F}_p^n$ and let $q$ be a quadratic form on $V$. A quadratic
average with base $(V, q)$ is a function of the form $Q(x) = \mathbb{E}_{y \in x - V}\,\omega^{q_y(x)}$, where each function
$q_y$ is a quadratic map from $y + V$ to $\mathbb{F}_p$ defined by a formula of the form $q_y(x) = q(x - y) + \phi_y(x - y)$
for some Freiman homomorphism $\phi_y : V \to \mathbb{F}_p$. The rank of $Q$ is the rank
of the quadratic form $q$, and the complexity of $Q$ is the codimension of $V$.

Notice that if $Q$ is a quadratic average then $\|Q\|_\infty \le 1$. For fixed $x$ it is natural to write
the set where $q(x - y)$ is defined as $x - V$, but since $V$ is symmetric this is equal to $x + V$.
We write it as $x - V$ because writing it as $x + V$ makes certain proofs slightly confusing.

Before we move on let us describe a particular property of quadratic averages which 
will be used on a number of occasions in the sequel. It is a consequence of the fact that 
the quadratic phases are somewhat "parallel" . In order to become familiar with quadratic 
averages, let us prove that high-rank quadratic averages have small U 2 norm. We begin 
with a standard fact about Gauss sums. Since the proof is short, we include it for the sake 
of completeness. 

Lemma 3.3. Let $q$ be a quadratic form of rank $r$ and let $\phi$ be a linear function. Then

$$|\mathbb{E}_x\,\omega^{q(x)+\phi(x)}| \le p^{-r/2}.$$

Proof. Let $\beta$ be the bilinear form given by the formula $\beta(x, y) = q(x + y) - q(x) - q(y)$.
Then

$$|\mathbb{E}_x\,\omega^{q(x)+\phi(x)}|^2 = \mathbb{E}_{x,y}\,\omega^{q(x+y)+\phi(x+y)-q(x)-\phi(x)} = \mathbb{E}_{x,y}\,\omega^{\beta(x,y)+q(y)+\phi(y)}.$$

For each $y$, the expectation over $x$ is zero unless $\beta(x, y)$ is constant, and therefore identically
zero, in $x$. But for such a $y$ we have $q(y) = 0$ as well, so the expectation over $x$ is $\omega^{\phi(y)}$,
which has modulus 1.

Since $q$ has rank $r$, the space of $y$ such that $\beta(x, y)$ is zero for every $x$ has codimension
$r$, so the right hand side is at most $p^{-r}$. This proves the lemma. □
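The Gauss sum bound can be checked numerically in small cases. The sketch below restricts to diagonal forms $q(x) = \sum_i a_i x_i^2$ over $\mathbb{F}_p$ with $p$ odd, for which the rank is simply the number of nonzero $a_i$. This restriction and the helper name `gauss_average` are our assumptions for illustration.

```python
import cmath
import itertools


def gauss_average(p, diag, lin):
    """Return E_x omega^{q(x) + phi(x)} over F_p^n, where
    q(x) = sum_i diag[i] * x_i^2 and phi(x) = sum_i lin[i] * x_i."""
    n = len(diag)
    omega = cmath.exp(2j * cmath.pi / p)
    total = 0
    for x in itertools.product(range(p), repeat=n):
        exponent = sum(a * xi * xi + b * xi for a, b, xi in zip(diag, lin, x)) % p
        total += omega ** exponent
    return total / p ** n


def diagonal_rank(p, diag):
    """Rank of the diagonal form (valid for odd p): number of nonzero coefficients."""
    return sum(1 for a in diag if a % p != 0)
```

With $p = 5$, `diag = (1, 2, 0)` and `lin = (3, 1, 0)` the rank is 2 and the average has modulus $5^{-1}$, so the bound is attained. Changing the last linear coefficient to 1 makes $\phi$ nonzero on the annihilator of $\beta$, and the average vanishes.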

Note that in fact we have the more precise result that the expectation is zero if $\phi$ does
not vanish on the annihilator of $\beta$, and has modulus $p^{-r/2}$ otherwise.

Lemma 3.4. Let $Q$ be a quadratic average of rank $r$. Then $\|Q\|_{U^2} \le p^{-r/4}$.

Proof. Let $V$ be a subspace of $\mathbb{F}_p^n$ and let $\phi_1$, $\phi_2$, $\phi_3$ and $\phi_4$ be linear functions defined on
$V$. Let $q$ be a quadratic form on $V$ of rank $r$ and let $q_i = q + \phi_i$. Then

$$\mathbb{E}_{x,a,b}\,\omega^{q_1(x)-q_2(x+a)-q_3(x+b)+q_4(x+a+b)} = \mathbb{E}_{a,b}\,\omega^{\beta(a,b)},$$

where $\beta$ is a bilinear form of rank $r$. For each $a$, the expectation over $b$ is zero unless $\beta(a, b)$
is zero for every $b$, which happens only when $a$ belongs to an $r$-codimensional subspace of
$V$. Therefore, the right-hand side equals $p^{-r}$.

Now let $Q$ be a quadratic average with base $(V, q)$. Then

$$\|Q\|_{U^2}^4 = \mathbb{E}_{x_1+x_2=x_3+x_4}\, Q(x_1)Q(x_2)\overline{Q(x_3)}\,\overline{Q(x_4)}.$$

If we condition on the right-hand side according to which translates $x_1$, $x_2$, $x_3$ and $x_4$
belong to, we obtain an expectation of expectations of the form discussed in the previous
paragraph. Indeed, let $y_1 + y_2 = y_3 + y_4$ and let us calculate the average over all $x_1 + x_2 =
x_3 + x_4$ such that $x_i \in y_i + V$. Inside $y_i + V$, we have $Q(x) = \omega^{q(x-y_i)+\phi_{y_i}(x-y_i)}$ for some
Freiman homomorphism $\phi_{y_i}$. Thus, the average in question is

$$\mathbb{E}_{x_1+x_2=x_3+x_4}\,\omega^{q(x_1-y_1)+\phi_{y_1}(x_1-y_1)+q(x_2-y_2)+\phi_{y_2}(x_2-y_2)-q(x_3-y_3)-\phi_{y_3}(x_3-y_3)-q(x_4-y_4)-\phi_{y_4}(x_4-y_4)},$$

where the average is over $x_i \in y_i + V$. We can then substitute by setting a new $x_i$ to be
$x_i - y_i$ and we have an average of the required form. The result follows. □
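In the simplest case $V = \mathbb{F}_p^n$, a quadratic average is a single quadratic phase $\omega^{q}$, and the lemma can be checked directly from the definition of the $U^2$ norm. The following brute-force sketch does this for diagonal forms over a small field. The helper names and the use of the formula $\|f\|_{U^2}^4 = \mathbb{E}_{x,a,b} f(x)\overline{f(x+a)}\,\overline{f(x+b)}f(x+a+b)$ are stated here as our working assumptions.

```python
import cmath
import itertools


def quadratic_phase(p, diag):
    """Return x -> omega^{q(x)} on F_p^n as a dict, for q(x) = sum_i diag[i] * x_i^2."""
    omega = cmath.exp(2j * cmath.pi / p)
    n = len(diag)
    return {
        x: omega ** (sum(a * xi * xi for a, xi in zip(diag, x)) % p)
        for x in itertools.product(range(p), repeat=n)
    }


def u2_norm(p, f):
    """Brute-force U^2 norm of f, given as a dict keyed by the points of F_p^n."""
    pts = list(f)

    def add(x, y):
        return tuple((a + b) % p for a, b in zip(x, y))

    total = 0
    for x in pts:
        for a in pts:
            for b in pts:
                total += (f[x] * f[add(x, a)].conjugate()
                          * f[add(x, b)].conjugate() * f[add(add(x, a), b)])
    return abs(total / len(pts) ** 3) ** 0.25
```

For $p = 3$ the rank-2 form $x_1^2 + x_2^2$ gives $U^2$ norm $3^{-1/2}$ and the rank-1 form $x_1^2$ gives $3^{-1/4}$, so in these cases the bound $p^{-r/4}$ holds with equality.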

It may seem slightly strange that the above result does not depend on the complexity 
of the quadratic averages Q. But this is because the assumption about the rank of q is 
stronger the larger the codimension of V. Thus, there is in fact a dependence but it is 
disguised by the way the lemma is formulated. 
Now we can state a slightly more precise version of the inverse theorem that follows
easily from the remark of Green and Tao when $p > 2$, and follows with some care from the
work of Samorodnitsky in the case $p = 2$ [W09].

Theorem 3.5. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_\infty \le 1$ and $\|f\|_{U^3} \ge \delta$. Then
there exists a quadratic average $Q$ of complexity at most $(2/\delta)^{C_p}$ such that

$$|\langle f, Q\rangle| \ge (\delta/2)^{C_p}/2.$$

Proof. The results of Green and Tao tell us that we can find a subspace $V$ satisfying the
above codimension bound, a quadratic function $q$ defined on $V$, and for each $y$ a linear
map $\phi_y : V \to \mathbb{F}_p$, such that, defining $q_y(x) = q(x - y) + \phi_y(x - y)$ on $y + V$, we have

$$\mathbb{E}_y |\mathbb{E}_{x \in y+V} f(x)\,\omega^{-q_y(x)}| \ge (\delta/2)^{C_p}.$$

For each function $q_y$ we can add a constant $\lambda_y \in \mathbb{F}_p$ without affecting the left-hand side.
We can choose this constant so that

$$\Re\big(\mathbb{E}_{x \in y+V} f(x)\,\omega^{-(q_y(x)+\lambda_y)}\big) \ge \tfrac{1}{2}\,|\mathbb{E}_{x \in y+V} f(x)\,\omega^{-q_y(x)}|.$$

Therefore, after suitably redefining the functions $q_y$ and setting $Q(x) = \mathbb{E}_{y \in x - V}\,\omega^{q_y(x)}$, we
have

$$|\langle f, Q\rangle| \ge \Re\big(\mathbb{E}_x \mathbb{E}_{y \in x-V} f(x)\,\omega^{-q_y(x)}\big) \ge \tfrac{1}{2}\,\mathbb{E}_y |\mathbb{E}_{x \in y+V} f(x)\,\omega^{-q_y(x)}|,$$

which proves the theorem. □

Now let us quickly deduce a corresponding decomposition result by the same method as 
before. 

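Theorem 3.6. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_2 \le 1$. Then for every $\delta > 0$
and $\eta > 0$ there exists $M$ such that $f$ has a decomposition of the form

$$f(x) = \sum_i \lambda_i Q_i(x) + g(x) + h(x),$$

where the $Q_i$ are quadratic averages on $\mathbb{F}_p^n$ of complexity at most $(2/\delta)^{C_p}$, and

$$\eta^{-1}\|g\|_1 + \delta^{-1}\|h\|_{U^3} + M^{-1}\sum_i |\lambda_i| \le 1.$$

In fact, $M$ can be taken to be $(2/\eta\delta)^{C_p}/2$.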

Proof. Suppose not. Then for every quadratic average $Q$ on $\mathbb{F}_p^n$, let $V(Q)$ be the
one-dimensional subspace of $\mathbb{C}^{\mathbb{F}_p^n}$ generated by $Q$ with the norm of $\lambda Q$ defined to be $|\lambda|$.
Applying Corollary 2.4 to these norms and subspaces, and also to the $L_1$ norm and
$U^3$ norm defined on all of $\mathbb{C}^{\mathbb{F}_p^n}$, we find a function $\phi : \mathbb{F}_p^n \to \mathbb{C}$ such that $|\langle f, \phi\rangle| \ge 1$,
$\|\phi\|_\infty \le \eta^{-1}$, $\|\phi\|_{U^3}^* \le \delta^{-1}$ and $|\langle \phi, Q\rangle| \le M^{-1}$ for every quadratic average $Q$.

As before, our assumptions imply that $\|\phi\|_{U^3} \ge \delta$. Applying the more precise inverse
result (Theorem 3.5) to $\eta\phi$, we find that there is a quadratic average $Q$ of complexity at
most $(2/\delta)^{C_p}$ such that $|\langle \phi, Q\rangle| \ge (\eta\delta/2)^{C_p}$, contradicting the fact that this correlation has
to be at most $M^{-1}$. □



4. Clusters of highly correlated quadratic phases 

Now let us deal with the difficulty that the above decomposition may be into a huge
number of quadratic averages. Corollary 4.2 below is a simple result that will be used to
place a strong restriction on the decompositions that can occur.

To prepare for it, let us define a vertex-weighted graph to be a graph $G$ together with a
function $\mu : V(G) \to \mathbb{R}_+$. The weights $\mu$ induce an obvious weighting of the edges: if $xy$ is
an edge of $G$ then its weight is $\mu(x)\mu(y)$. Finally, if $H$ is a subgraph of $G$, then we define
the total weight of $H$ to be the sum of the weights of all the edges in $H$.

Lemma 4.1. Let $C$ and $\gamma > 0$ be constants, and let $G$ be a vertex-weighted graph whose
vertex weights sum to at most $C$. Then there exist vertices $x_1, \dots, x_k$ of $G$, with $k \le C^2/2\gamma$,
such that if $A$ is the set of all vertices not joined to any of the $x_i$, then the total weight of
the subgraph of $G$ induced by $A$ is at most $\gamma$.

Proof. If the total weight of $G$ is at least $\gamma$, then $\sum_x \mu(x) \sum_{y \in N(x)} \mu(y) \ge \gamma/2$, where we
write $N(x)$ for the neighbourhood of $x$. Since $\sum_x \mu(x) \le C$, it follows that there exists $x$
such that $\sum_{y \in N(x)} \mu(y) \ge \gamma/2C$. Let $x_1$ be a vertex with this property and let $G_1$ be the
subgraph of $G$ induced by the vertices not in $N(x_1)$.

Now repeat this argument for $G_1$, and so on. Since at each stage we remove a set of
vertices with weights summing to at least $\gamma/2C$, the process cannot continue for more than
$C^2/2\gamma$ steps, at which point the graph induced by the complements of all the
neighbourhoods we have chosen has total weight at most $\gamma$, as claimed. □
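The proof of Lemma 4.1 is a greedy procedure, and it may help to see it spelled out. The sketch below is one possible reading of it: it repeatedly picks a vertex whose surviving neighbourhood has the largest weight (the proof only needs a vertex with neighbourhood weight at least $\gamma/2C$) and deletes that neighbourhood, stopping once the remaining edge weight drops below $\gamma$. The graph representation, the function names and the max-weight choice rule are our assumptions. The code checks only the residual weight and makes no attempt to verify the constant in the bound on $k$.

```python
def edge_weight(mu, edges, alive):
    """Total weight mu(x) * mu(y) of the edges with both endpoints in `alive`."""
    return sum(mu[x] * mu[y] for x, y in edges if x in alive and y in alive)


def greedy_centres(mu, edges, gamma):
    """Greedy selection in the spirit of Lemma 4.1.

    mu:    dict mapping each vertex to a non-negative weight
    edges: iterable of pairs (x, y), each edge listed once
    gamma: positive threshold for the residual edge weight
    Returns (centres, A), where A is the set of vertices not joined to any centre.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    edges = list(edges)
    neighbours = {v: set() for v in mu}
    for x, y in edges:
        neighbours[x].add(y)
        neighbours[y].add(x)
    alive = set(mu)
    centres = []
    while edge_weight(mu, edges, alive) >= gamma:
        # Choose a surviving vertex whose surviving neighbourhood is heaviest.
        x = max(alive, key=lambda v: sum(mu[y] for y in neighbours[v] if y in alive))
        centres.append(x)
        alive -= neighbours[x]  # delete the neighbourhood N(x)
    return centres, alive
```

Every deleted vertex is adjacent to one of the chosen centres, so the surviving set is exactly the set $A$ of the lemma, and the loop terminates because each round removes at least one vertex of positive weight.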

Corollary 4.2. Let $u_1, \dots, u_n$ be a collection of vectors of norm at most 1 in a Hilbert
space $H$, let $\lambda_1, \dots, \lambda_n$ be scalars with $\sum_{i=1}^n |\lambda_i| \le C$ and let $\delta > 0$. Then there are vectors
$u_{i_1}, \dots, u_{i_k}$ and a set $A \subset \{1, 2, \dots, n\}$ such that $k \le C^2/\delta^2$, and with the following
properties. For every $i \notin A$ there exists $j$ such that $|\langle u_i, u_{i_j}\rangle| \ge \delta^2/2C^2$, and
$\|\sum_{i \in A} \lambda_i u_i\|_2 \le \delta$.

Proof. Define a vertex-weighted graph by taking the vectors $u_1, \dots, u_n$ as the vertices,
letting the weight of $u_i$ be $|\lambda_i|$, and joining $u_i$ to $u_j$ if and only if $|\langle u_i, u_j\rangle| \ge \delta^2/2C^2$.
Now we apply the previous lemma with $\gamma = \delta^2/2$. It gives us vectors $u_{i_1}, \dots, u_{i_k}$, with
$k \le C^2/\delta^2$, such that if $A$ is the set of all $i$ for which there is no $i_j$ with $|\langle u_i, u_{i_j}\rangle| \ge \delta^2/2C^2$,
then the sum of all $|\lambda_i||\lambda_j|$ with $i, j \in A$ and $|\langle u_i, u_j\rangle| \ge \delta^2/2C^2$ is at most $\delta^2/2$. But

$$\Big\|\sum_{i \in A} \lambda_i u_i\Big\|_2^2 \le \sum_{i,j \in A} |\lambda_i||\lambda_j||\langle u_i, u_j\rangle|,$$

and we can split the last sum into two parts, according to whether $|\langle u_i, u_j\rangle|$ is at least
$\delta^2/2C^2$ or less than $\delta^2/2C^2$. The first part is at most $\delta^2/2$, since we always have $|\langle u_i, u_j\rangle| \le 1$,
and the second is at most $C^2\delta^2/2C^2 = \delta^2/2$. This proves the result. □

We are interested in sums of the form $\sum_i \lambda_i u_i$ where the $u_i$ are built out of polynomial
phase functions. These have the property that if two of them have a significant correlation
then they must have a strong algebraic relationship. For example, if two linear phase
functions have any correlation at all, then they must be equal up to a scalar multiple, and if
two quadratic phase functions are well correlated, then the difference of the corresponding
quadratic forms must have low rank. Thus, Corollary 4.2 immediately implies that if
$f_1 = \sum_i \lambda_i \omega^{q_i}$, then we can write $f_1$ as $f_2 + g$, where $f_2$ is composed of a few clusters of
quadratic phase functions that do not differ except in a "linear" way, and $g$ is small in $L_2$.

Now let us make this remark precise. We start off by finding out what it means for two 
quadratic averages to be well correlated. 

Lemma 4.3. Let $Q$ and $Q'$ be quadratic averages with bases $(V, q)$ and $(V', q')$, respectively.
Suppose that the rank of $q - q'$ considered as a quadratic form on $V \cap V'$ is $r$. Then

$$|\langle Q, Q'\rangle| \le p^{-r/2}.$$

Proof. By the remarks following Definition 3.2, we know that on each translate of $V \cap V'$
the function $Q\overline{Q'}$ is given by a formula of the form $\omega^{q-q'+\phi}$, where $\phi$ is a linear function.
Therefore, by Lemma 3.3 the expectation of $Q\overline{Q'}$ over each such translate has modulus at
most $p^{-r/2}$. The result follows. □

Once again, the apparent lack of any dependence on the complexities of Q and Q' is an 
illusion: the higher the complexity, the stronger the rank assumption. 

We now examine the $U^2$ dual norm of low-rank quadratic averages. First we need a
rather crude lemma. Given a subspace $W$ of $\mathbb{F}_p^n$ we define $W^\perp$ to be the space of all $r$
such that $r^T x = 0$ for every $x \in W$.
Lemma 4.4. Let $W$ be a subspace of $\mathbb{F}_p^n$ of codimension $d$, let $y \in \mathbb{F}_p^n$ and let $\phi$ be a
linear function. Let $g$ be the function that equals $\omega^{\phi(x)}$ for $x \in y + W$ and 0 elsewhere. Then
$\|g\|_{U^2}^* = p^{-d/4}$.

Proof. Since the $U^2$ dual norm is unaffected by translation and by multiplying by a linear
phase function, we can assume that $y = 0$ and $\phi$ is the zero function. Thus, we are
calculating the $U^2$ dual norm of the characteristic function of $W$. The Fourier transform
of this function takes the value $p^{-d}$ at every $r \in W^\perp$ and 0 everywhere else. The $U^2$
dual norm is the $\ell^{4/3}$ norm of this Fourier transform, which is $(p^d p^{-4d/3})^{3/4}$, which equals
$p^{-d/4}$. □

Before we continue, here is an alternative proof of Lemma 4.4 that does not use the
Fourier transform. This will be useful later when we want to generalize it. One observes
first that $g(x) = p^{2d}\,\mathbb{E}_{z+w-y=x}\, g(z)g(w)\overline{g(y)}$. From this it follows that, for any function
$f : \mathbb{F}_p^n \to \mathbb{C}$,

$$\langle g, f\rangle = p^{2d}\,\mathbb{E}_{x+y=z+w}\,\overline{f(x)}\,\overline{g(y)}\,g(z)g(w),$$

which has modulus at most $p^{2d}\|f\|_{U^2}\|g\|_{U^2}^3$. It is easy to prove (without Fourier analysis)
that $\|g\|_{U^2}^4 = p^{-3d}$, so $|\langle f, g\rangle| \le p^{-d/4}\|f\|_{U^2}$, which proves the lemma. We remark that the
inequality is sharp because $\langle g, g\rangle = p^{-d} = p^{-d/4}\|g\|_{U^2}$.

Lemma 4.5. Let $V$ and $V'$ be subspaces of $\mathbb{F}_p^n$, and let $q$ and $q'$ be quadratic forms defined
on $V$ and $V'$, respectively. Suppose that the codimension of $V \cap V'$ is $d$ and the rank of the
restriction of $q - q'$ to $V \cap V'$ is $r$. Let $Q$ and $Q'$ be quadratic averages with bases $(V, q)$
and $(V', q')$, respectively. Then $\|Q\overline{Q'}\|_{U^2}^* \le p^{3(d+r)/4}$.

Proof. For any fixed $y$, the restriction of the function $Q\overline{Q'}$ to $y + V \cap V'$ is equal to $\omega^{q-q'+\phi}$
for some linear function $\phi$. Since the rank of $q - q'$ is $r$, there is a subspace $W \subset V \cap V'$,
of codimension $r$ in $V \cap V'$, such that the function $\omega^{q-q'}$ is constant on those translates of
$W$ that live inside $y + V \cap V'$. Thus the restriction of $Q\overline{Q'}$ to any translate of $W$ that lives
inside $y + V \cap V'$ is a linear phase function. By Lemma 4.4, each of these restrictions has
$U^2$ dual norm $p^{-(r+d)/4}$, so their sum, which is the restriction of $Q\overline{Q'}$ to $y + V \cap V'$, has $U^2$
dual norm at most $p^{r-(r+d)/4}$. The same is true for all translates of $V \cap V'$, of which there
are $p^d$. Therefore, the $U^2$ dual norm of $Q\overline{Q'}$ is at most $p^{3(r+d)/4}$, as claimed. □
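The Fourier computation for the characteristic function of a subspace of codimension $d$ (the $\ell^{4/3}$ norm of its Fourier transform equals $p^{-d/4}$) is easy to confirm numerically. The sketch below takes $W = \{x : x_1 = \dots = x_d = 0\}$, computes the Fourier transform by brute force, and returns its $\ell^{4/3}$ norm. The choice of a coordinate subspace, the normalisation $\hat{f}(r) = \mathbb{E}_x f(x)\omega^{r^T x}$ and the function name are our assumptions.

```python
import cmath
import itertools


def subspace_indicator_dual_norm(p, n, d):
    """l^{4/3} norm of the Fourier transform of 1_W on F_p^n,
    where W = {x : x_1 = ... = x_d = 0} has codimension d."""
    omega = cmath.exp(2j * cmath.pi / p)
    points = list(itertools.product(range(p), repeat=n))
    W = [x for x in points if all(xi == 0 for xi in x[:d])]
    total = 0.0
    for r in points:
        # hat{1_W}(r) = E_x 1_W(x) omega^{r . x}; the sign convention does not affect |.|
        coeff = sum(
            omega ** (sum(ri * xi for ri, xi in zip(r, x)) % p) for x in W
        ) / len(points)
        total += abs(coeff) ** (4 / 3)
    return total ** 0.75
```

For $p = 3$, $n = 3$ and $d = 2$ the result agrees with $3^{-2/4}$ up to rounding.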



Now let us put these facts together in order to obtain a more sophisticated decomposition 
into linear combinations of quadratic averages. 
Theorem 4.6. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_2 \le 1$, and let $\delta > 0$. Let
$d = (2/\delta)^{C_p}$ and $C = (2/\delta^2)^{C_p}$. Then $f$ has a decomposition

$$f(x) = \sum_{i=1}^k Q_i(x)U_i(x) + g(x) + h(x),$$

where $k \le C^2/\delta^2$, the $Q_i$ are quadratic averages on $\mathbb{F}_p^n$, $\sum_{i=1}^k \|U_i\|_{U^2}^* \le 2^{3/2}C^4\delta^{-3}p^{3d/4}$,
$\sum_{i=1}^k \|U_i\|_\infty \le C$, $\|g\|_1 \le 2\delta$ and $\|h\|_{U^3} \le \delta$.

Proof. We begin by using Theorem 3.6 to decompose $f$ into a sum $\sum_i \lambda_i Q_i(x) + g'(x) + h(x)$,
where each $Q_i$ is a quadratic average of complexity at most $(2/\delta)^{C_p}$, and $\|g'\|_1 \le \delta$,
$\|h\|_{U^3} \le \delta$ and $\sum_i |\lambda_i| \le (2/\delta^2)^{C_p}$. Let us set $d$ to equal $(2/\delta)^{C_p}$ and $C$ to equal $(2/\delta^2)^{C_p}$.

Now we apply Corollary 4.2 to the linear combination $\sum_i \lambda_i Q_i$. Without loss of
generality, the functions that it gives us are $Q_1, \dots, Q_k$. Hence we can write $\sum_i \lambda_i Q_i$ in the
form $\sum_{i=1}^k Q_i U_i + g''$, where $k \le C^2/\delta^2$, $\|g''\|_2 \le \delta$ and each $U_i$ is a function of the form
$\sum_{j \in A_i} \lambda_j \overline{Q_i} Q_j$ with $|\langle Q_i, Q_j\rangle| \ge \delta^2/2C^2$.

By Lemma 4.3 it follows that for each $j \in A_i$ the rank $r$ of the quadratic form $q_i - q_j$
is such that $p^{r/2} \le 2C^2/\delta^2$. Since the complexity of each $Q_i$ is at most $d$, Lemma 4.5 then
tells us that $\|\overline{Q_i}Q_j\|_{U^2}^*$ is at most $(2C^2/\delta^2)^{3/2}p^{3d/4}$. Since $\sum_{i=1}^k \sum_{j \in A_i} |\lambda_j| \le C$, it follows
that $\sum_{i=1}^k \|U_i\|_{U^2}^* \le 2^{3/2}C^4\delta^{-3}p^{3d/4}$ and $\sum_{i=1}^k \|U_i\|_\infty \le C$. Since $\|g''\|_1 \le \|g''\|_2$, the result
is true with $g = g' + g''$. □

By making some very minor changes to the argument it is possible to gain independent
control over the $L_1$ and the $U^3$ error in this approximation, but we shall not need to do so
here.



5. A STRONGER DECOMPOSITION FOR HIGHLY UNIFORM FUNCTIONS 

Later on, we shall need to use the fact that if a function $f$ is uniform, then the quadratic
averages used to decompose it can all be taken to have high rank. This result is very
plausible, as low-rank quadratic averages are anti-uniform and should therefore not be
necessary, but there does not seem to be a truly short proof of this fact.

The next lemma shows that one can split any function of the type $\sum_{i=1}^k Q_i U_i$ into a
low-rank part and a high-rank part in such a way that there is a substantial gap between
the ranks in the two parts. The proof is very short, mainly because the work has already
been done: in order to have a useful result we are relying on the fact that $k$ is small.
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Lemma 5.1. Let R , m and t > 1 be constants. Let Qi, Q 2 , . . . ,Qk be quadratic averages. 
Then there is a partition of {1,2, ... ,k} into two sets L and H, and a constant R £ 
[Rq, m k (Ro + 1)} , such that the rank of Qi is at most R for every i £ L and at least mR + 1 
for every i £ H . 

Proof. Without loss of generality the $Q_i$ are arranged in increasing order of rank. If there is no $i$ such that $Q_i$ has rank at least $m^k(R_0 + t)$, then let $L = \{1, \dots, k\}$, and we are done. Otherwise, for each $j$ let $R_j = m^j R_0 + (m^{j-1} + m^{j-2} + \dots + 1)t$ and let $i$ be minimal such that $Q_i$ has rank at least $R_i$. Since $R_i = mR_{i-1} + t$ and $R_k \leq m^k(R_0 + t)$, we are done. □
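The choice of threshold in this proof is effectively a short algorithm, and it may help to see it spelled out. The following Python sketch is an illustration only: the function name `rank_gap_partition` and its interface are hypothetical, the ranks are supplied as plain integers, and we assume $m \geq 2$ so that $R_k \leq m^k(R_0 + t)$.

```python
def rank_gap_partition(ranks, R0, m, t):
    """Split indices into (low, high, R) in the spirit of Lemma 5.1.

    ranks: list of non-negative integers (the ranks of Q_1, ..., Q_k, any order).
    Returns index lists low, high and a threshold R such that
    ranks[i] <= R for i in low and ranks[i] >= m*R + t for i in high.
    Illustrative sketch only; assumes m >= 2.
    """
    k = len(ranks)
    order = sorted(range(k), key=lambda i: ranks[i])  # increasing order of rank
    # thresholds[j] = R_j, with R_0 = R0 and R_j = m * R_{j-1} + t
    thresholds = [R0]
    for _ in range(k):
        thresholds.append(m * thresholds[-1] + t)
    for pos in range(1, k + 1):
        if ranks[order[pos - 1]] >= thresholds[pos]:
            # first position whose rank reaches R_pos: cut just before it
            low = sorted(order[:pos - 1])
            high = sorted(order[pos - 1:])
            return low, high, thresholds[pos - 1]
    # no such position: every rank is below R_k, so all indices are "low"
    return sorted(order), [], thresholds[k]


low, high, R = rank_gap_partition([3, 100, 5, 4], R0=2, m=2, t=1)
print(low, high, R)
```

In the example the thresholds are 2, 5, 11, 23, 47, and only the average of rank 100 ends up in the high-rank part.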

The next lemma is a standard application of Bogolyubov's method, and shows that anti- 
uniform functions are well approximated by their convolution with a low-codimensional 
subspace V. Alternatively, in the language of Green and Tao, which we shall not be using 
here, anti-uniform functions are well-approximated by their projection onto a suitable low- 
complexity linear factor. We write $\mu_V$ for the characteristic measure of this subspace $V$, which has the property that $\mathbb{E}_x \mu_V(x) = 1$.

Lemma 5.2. Let $\delta > 0$ and $T$ be constants, let $f : \mathbb{F}_p^n \to \mathbb{C}$ and suppose that $\|f\|_{U^2}^* \leq T$. Then there is a linear subspace $V$ of codimension at most $\delta^{-4} T^4$ such that $\|f - f * \mu_V\|_2 \leq \delta$.
Proof. Our assumption can be rephrased in Fourier terms as the assertion that $\|\hat{f}\|_{4/3} \leq T$. By the Fourier inversion formula, we know that $f(x) = \sum_r \hat{f}(r) \omega^{-r^T x}$ for every $x$. Let $\rho = \delta^3 T^{-2}$ and let $K = \{r : |\hat{f}(r)| \geq \rho\}$.

Let $V$ be the subspace of all $x$ such that $r^T x = 0$ for every $r \in K$. Since $\|\hat{f}\|_{4/3} \leq T$, we know that $|K| \leq T^{4/3} \rho^{-4/3}$. Therefore, $V$ has codimension at most $(T/\rho)^{4/3} = \delta^{-4} T^4$. Then we can decompose $f$ as a sum $f_1 + f_2$, where $f_1(x) = \sum_{r \in K} \hat{f}(r) \omega^{-r^T x}$ and $f_2(x) = \sum_{r \notin K} \hat{f}(r) \omega^{-r^T x}$. It is easy to see that $f_1 = f * \mu_V$, and we can bound the $L_2$ norm of $f_2$ as follows.

\[ \|f_2\|_2^2 = \|\hat{f_2}\|_2^2 \leq \|\hat{f_2}\|_\infty^{2/3} \|\hat{f_2}\|_{4/3}^{4/3} \leq \rho^{2/3} T^{4/3} = \delta^2. \]

Therefore, the statement of the lemma follows from our calculations. □

It follows from Lemma 5.2 that the product of a low-rank quadratic average with an
anti-uniform function is also well approximated by its convolution with a suitable subspace 
of bounded codimension. 

Corollary 5.3. Let $\delta > 0$ and $T$ be constants, let $U : \mathbb{F}_p^n \to \mathbb{C}$ and suppose that $\|U\|_{U^2}^* \leq T$. Let $r$ and $d$ be constants, and let $Q$ be a quadratic average of rank at most $r$ and complexity at most $d$. Let $f = QU$. Then there is a linear subspace $V$ of codimension at most $\delta^{-4} T^4 + r + p^d$ such that $\|f - f * \mu_V\|_2 \leq \delta$.

Proof. By Lemma 5.2 there is a subspace $V_0$ of codimension at most $\delta^{-4} T^4$ such that $\|U - U * \mu_{V_0}\|_2 \leq \delta$. Let $Q$ have base $(V_1, q)$. Since $q$ has rank at most $r$, $V_1$ contains a subspace $W_1$ of codimension at most $r$ such that $\omega^q$ is constant on each translate of $W_1$. Therefore, for each $y$ the function $\omega^{q(x-y)}$ is constant on those translates of $W_1$ that are subsets of $y + V_1$.

Now on $y + V_1$ the quadratic average $Q(x)$ takes the form $\omega^{q(x-y) + \phi_y(x-y)}$ for some linear form $\phi_y$. There is a partition of $y + V_1$ into $p$ affine subspaces of codimension 1, on each of which $\phi_y$ is constant, from which it follows that there is a subspace $W_2$ of codimension $p^d$ on which all of the functions $\phi_y$ are constant.

Putting these two facts together, we obtain a subspace $V_2$ of codimension at most $r + p^d$ such that $Q$ is constant on cosets of $V_2$.

Now let $V = V_0 \cap V_2$, so that $V$ has codimension at most $\delta^{-4} T^4 + r + p^d$. Since $V$ is a subspace of $V_0$, we have that $\|U - U * \mu_V\|_2 \leq \delta$. Now $Q$ has constant modulus 1, so that $\|QU - Q(U * \mu_V)\|_2 \leq \delta$, and since $Q$ is constant on cosets of $V$, $Q(U * \mu_V) = (QU) * \mu_V$. The result is proved. □

Finally, let us prove a couple more easy technical lemmas on how various L p and unifor- 
mity norms interact with the convolution operator. 

Lemma 5.4. Let $V$ be a subspace of $\mathbb{F}_p^n$ and let $f$ be a function from $\mathbb{F}_p^n$ to $\mathbb{C}$. Let $\|\cdot\|$ be any translation-invariant norm defined on such functions. Then $\|f * \mu_V\| \leq \|f\|$.

Proof. If we write $f_v(x)$ for $f(x + v)$, we find that $f * \mu_V = \mathbb{E}_{v \in V} f_v$ and we know that all the functions $f_v$ have the same norm as $f$. The lemma therefore follows from the triangle inequality. □

Lemma 5.5. Let $V$ be a linear subspace of codimension $r$ on $\mathbb{F}_p^n$ and let $f$ be a function from $\mathbb{F}_p^n$ to $\mathbb{C}$ that is constant on the cosets of $V$. Then $\|f\|_{U^2} \geq p^{-r/4} \|f\|_2$ and $\|f\|_{U^2}^* \leq p^{r/4} \|f\|_2$.

Proof. Let $T$ be a linear map from $\mathbb{F}_p^n$ to $\mathbb{F}_p^r$ with kernel $V$. Let $g : \mathbb{F}_p^r \to \mathbb{C}$ be defined by the formula $f(x) = g(Tx)$, which is well-defined since $f$ is constant on translates of $V$. It is easy to see that $\|f\|_2 = \|g\|_2$ and $\|f\|_{U^2} = \|g\|_{U^2}$. But

\[ \|g\|_{U^2}^4 = \mathbb{E}_y |\mathbb{E}_x g(x) \overline{g(x+y)}|^2 \geq p^{-r} (\mathbb{E}_x |g(x)|^2)^2 = p^{-r} \|g\|_2^4. \]

This proves the first part. For the second part, we know that $\|f\|_{U^2}^*$ is the maximum of $\langle f, h \rangle$ over all functions $h$ such that $\|h\|_{U^2} \leq 1$. Now replacing $h$ by $h * \mu_V$ does not affect the inner product $\langle f, h \rangle$ and does not increase $\|h\|_{U^2}$. Therefore, the maximum must be achieved by a function $h$ that is constant on the cosets of $V$. But then $|\langle f, h \rangle| \leq \|f\|_2 \|h\|_2$, which is at most $p^{r/4} \|f\|_2 \|h\|_{U^2}$, by the first part. This completes the proof of the lemma. □
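The first inequality of Lemma 5.5 is easy to test numerically in the smallest case. The sketch below is only an illustration under simplifying assumptions: it takes $n = r = 1$ (so $V = \{0\}$ and $g$ is an arbitrary function on $\mathbb{Z}/p\mathbb{Z}$), the helper names are hypothetical, and the $U^2$ norm is computed directly from the formula used in the proof. The indicator of a point shows that the bound can be attained.

```python
import random


def u2_norm_fourth_power(g):
    """Return ||g||_{U^2}^4 = E_y |E_x g(x) conj(g(x+y))|^2 for g on Z/pZ."""
    p = len(g)
    total = 0.0
    for y in range(p):
        inner = sum(g[x] * g[(x + y) % p].conjugate() for x in range(p)) / p
        total += abs(inner) ** 2
    return total / p


def l2_norm_squared(g):
    """Return ||g||_2^2 = E_x |g(x)|^2."""
    return sum(abs(v) ** 2 for v in g) / len(g)


p = 7
random.seed(0)
g = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(p)]

lhs = u2_norm_fourth_power(g)        # ||g||_{U^2}^4
rhs = l2_norm_squared(g) ** 2 / p    # p^{-r} ||g||_2^4 with r = 1
print(lhs >= rhs)

# The indicator function of a single point gives equality in the bound.
delta = [1.0] + [0.0] * (p - 1)
print(u2_norm_fourth_power(delta), l2_norm_squared(delta) ** 2 / p)
```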

Before stating the main result of this section, let us recall Lemma 3.4 from our previous paper [GW09a], which says that the rank of a bilinear form cannot decrease too much when restricted to a smaller subspace. In that paper we gave a proof based on algebraic arguments. Here we shall present a different approach, included for the sake of completeness, which turns out to be more generalizable to higher-degree forms [GW09b] as well as locally-defined quadratic forms [GW09c].

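Lemma 5.6. Let $\beta$ be a symmetric bilinear form of rank $r$ on $\mathbb{F}_p^n$ and let $W$ be a subspace of $\mathbb{F}_p^n$ of codimension $d$. Then the rank of the restriction of $\beta$ to $W$ is at least $r - 2d$.

Proof. We first observe that for any fixed $x \in \mathbb{F}_p^n$, $\mathbb{E}_{y \in W} \omega^{\beta(x,y)}$ is either 0 or 1, and hence non-negative, and similarly, for any fixed $y \in \mathbb{F}_p^n$, $\mathbb{E}_{x \in \mathbb{F}_p^n} \omega^{\beta(x,y)}$ is either 0 or 1. It follows that

\[ p^{-r_W} = \mathbb{E}_{x \in W} \mathbb{E}_{y \in W} \omega^{\beta(x,y)} \leq \frac{p^n}{|W|} \mathbb{E}_{x \in \mathbb{F}_p^n} \mathbb{E}_{y \in W} \omega^{\beta(x,y)} \leq \frac{p^{2n}}{|W|^2} \mathbb{E}_{x \in \mathbb{F}_p^n} \mathbb{E}_{y \in \mathbb{F}_p^n} \omega^{\beta(x,y)} = p^{2d} p^{-r}, \]

where we have written $r_W$ for the rank of the restriction of $\beta$ to $W$, and hence $r_W \geq r - 2d$. □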
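For readers who like to experiment, Lemma 5.6 can be checked by brute force on small examples. The following Python sketch is an illustration under our own assumptions: $p$ is prime, a bilinear form is represented by its matrix $M$, a subspace $W$ by a list of linearly independent basis vectors, and the helper names `rank_mod_p` and `restricted_rank` are hypothetical. The rank of the restriction is computed as the rank of the Gram matrix of the basis.

```python
import random


def rank_mod_p(matrix, p):
    """Rank over F_p (p prime) by Gaussian elimination."""
    rows = [[a % p for a in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        rows[rank] = [(a * inv) % p for a in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def restricted_rank(M, basis, p):
    """Rank of the form x^T M y restricted to span(basis), via the Gram matrix."""
    n = len(M)
    gram = [[sum(u[a] * M[a][b] * v[b] for a in range(n) for b in range(n)) % p
             for v in basis] for u in basis]
    return rank_mod_p(gram, p)


# Random symmetric forms on F_3^4 and random subspaces W of codimension d.
random.seed(1)
p, n = 3, 4
all_ok = True
for _ in range(200):
    M = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            M[a][b] = M[b][a] = random.randrange(p)
    k = random.randint(1, n)
    basis = [[random.randrange(p) for _ in range(n)] for _ in range(k)]
    if rank_mod_p(basis, p) != k:
        continue  # skip: the chosen vectors are not linearly independent
    d = n - k
    all_ok = all_ok and restricted_rank(M, basis, p) >= rank_mod_p(M, p) - 2 * d
print(all_ok)
```

The hyperbolic form with matrix `[[0, 1], [1, 0]]` restricted to the span of `(1, 0)` has rank 0, so a loss of $2d$ can actually occur.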

Putting these results together, we obtain the main decomposition theorem that we shall apply later to count certain types of linear configurations in uniform sets. It is similar to Theorem 4.6 but with the additional hypothesis that $f$ is highly uniform, which allows us to draw the stronger conclusion that all the quadratic averages used have high rank.

Theorem 5.7. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function such that $\|f\|_2 \leq 1$. Then for every $\delta > 0$ there exists a constant $C$ such that for every $R_0$ there exists a constant $c$ with the following property.

Let $d = (2/\delta)^{C_p}$ and $C = (2/\delta^2)^{C_p}$. Suppose that $\|f\|_{U^2} \leq c$. Then $f$ has a decomposition of the form

\[ f(x) = \sum_{i=1}^k Q_i(x) U_i(x) + g(x) + h(x), \]






where $k \leq C^2/\delta^2$, the $Q_i$ are quadratic averages on $\mathbb{F}_p^n$ of complexity at most $d$, $\sum_{i=1}^k \|U_i\|_{U^2}^* \leq 2^{3/2} C^4 \delta^{-3} p^{3d/4}$, $\sum_{i=1}^k \|U_i\|_\infty \leq C$, $\|g\|_1 \leq 7\delta$ and $\|h\|_{U^3} \leq 2\delta$. In addition, each quadratic average $Q_i$ has rank at least $R_0$ provided that $c$ satisfies the inequality $c \leq p^{-p^{16d} R_0}$.

Proof. Let us begin by applying Theorem 4.6 to obtain a decomposition of the form

\[ f(x) = \sum_{i=1}^k Q_i(x) U_i(x) + g'(x) + h'(x), \]

where $k \leq C^2/\delta^2$, the $Q_i$ are quadratic averages on $\mathbb{F}_p^n$ of complexity at most $d$, $\sum_{i=1}^k \|U_i\|_{U^2}^* \leq 2^{3/2} C^4 \delta^{-3} p^{3d/4}$, $\sum_{i=1}^k \|U_i\|_\infty \leq C$, $\|g'\|_1 \leq 2\delta$ and $\|h'\|_{U^3} \leq \delta$.

Let us assume that the quadratic averages $Q_i$ are arranged in increasing order of rank. Now let $m = k$, let $r$ be such that $p^{r/2} = 2C^2/\delta^2$ and let $t = p^{8d} + 2d + 2r$. Applying Lemma 5.1 we obtain positive integers $R \in [R_0, m^k(R_0 + t)]$ and $s \in \{0, 1, 2, \dots, k\}$ such that $Q_i$ has rank at most $R$ when $i \leq s$ and has rank at least $mR + t$ when $i > s$. Let $f_L = \sum_{i=1}^s Q_i U_i$ and $f_H = \sum_{i=s+1}^k Q_i U_i$.

Let $T = 2^{3/2} C^4 \delta^{-3} p^{3d/4}$. Theorem 4.6 tells us that $\sum_i \|U_i\|_{U^2}^*$, and hence each individual $\|U_i\|_{U^2}^*$, is at most $T$.

Let $\eta = \delta/k$. By Corollary 5.3, for every $i \leq s$ there is a linear subspace $V_i$ of codimension at most $\eta^{-4} T^4 + R + p^d$ such that $\|Q_i U_i - (Q_i U_i) * \mu_{V_i}\|_2 \leq \eta$. Let $V$ be the intersection of all the subspaces $V_i$. Then $V$ has codimension at most $k(\eta^{-4} T^4 + R + p^d) \leq kR + p^{8d}$ for sufficiently small $\delta$, and $\|f_L - f_L * \mu_V\|_2 \leq k\eta \leq \delta$.

Now let us return to our decomposition $f = f_L + f_H + g' + h'$. We shall convolve both sides with the measure $\mu_V$ of the subspace $V$ just constructed and consider its effect on the $L_2$ norms of $f$ and $f_H$. If both of these are small, it will allow us to approximate $f_L$ by a quadratically uniform function up to an error in $L_1$.

First, since $\|f\|_{U^2} \leq c$, Lemma 5.4 implies that $\|f * \mu_V\|_{U^2} \leq c$. It follows from Lemma 5.5 that $\|f * \mu_V\|_2 \leq c p^{(kR + p^{8d})/4}$, which is at most $\delta$ by our choice of $c$. Next, let us look at $f_H$. For each $i > s$, the quadratic average $Q_i$ has rank at least $kR + t$. Recall from the proof of Theorem 4.6 that $Q_i U_i$ is equal to a sum of the form $\sum_{j \in A_i} \lambda_j Q_j$ with $\sum_i \sum_{j \in A_i} |\lambda_j| \leq C$. For each $j$, write $(V_j, q_j)$ for the base of $Q_j$. Then we had the additional property that for every $j \in A_i$ the rank of $q_i - q_j$, considered as a quadratic form on $V_i \cap V_j$, was at most the $r$ chosen earlier, that is, the $r$ such that $p^{r/2} = 2C^2/\delta^2$.

If we consider $Q_i$ as a quadratic average with base $(V_i \cap V_j, q_i)$, then by Lemma 5.6 it has rank at least $kR + t - 2d$, and $Q_j$, considered as a quadratic average with base $(V_i \cap V_j, q_j)$, therefore has rank at least $kR + t - 2d - r$. It follows from Lemma 3.4 that $\|Q_j\|_{U^2} \leq p^{-(kR + t - 2d - r)/4}$, and hence from the triangle inequality that $\|f_H\|_{U^2} \leq \sum_{i > s} \|Q_i U_i\|_{U^2} \leq C p^{-(kR + t - 2d - r)/4}$. Therefore, by the same argument as we used for $f$, we find that

\[ \|f_H * \mu_V\|_2 \leq C p^{-(kR + t - 2d - r)/4} p^{(kR + p^{8d})/4} = C p^{(p^{8d} + 2d + r - t)/4} = C p^{-r/4}. \]

By our choice of $r$, this is less than $\delta$.

As for $g'$ and $h'$, we know by Lemma 5.4 that $\|g' * \mu_V\|_1 \leq \|g'\|_1 \leq 2\delta$ and $\|h' * \mu_V\|_{U^3} \leq \delta$. Since $\|f_L - f_L * \mu_V\|_2 \leq \delta$, we have ended up showing that $f_L$ can be written as a sum $g'' + h''$, where $\|g''\|_1 \leq 5\delta$ and $\|h''\|_{U^3} \leq \delta$. Therefore, we can write $f = f_H + g + h$ with $\|g\|_1 \leq 7\delta$, $\|h\|_{U^3} \leq 2\delta$. This proves the theorem. □

6. Proof of Theorem 1.2

The aim of this section is to show that if $f$ is a function of the form $\sum_j U_j Q_j$, where the $Q_j$ are high-rank quadratic averages and $\sum_j \|U_j\|_\infty$ is not too large, then $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_i f(L_i(x))$ is small in modulus whenever the linear forms $L_1, \dots, L_m$ are square independent. It is then relatively straightforward to deduce Theorem 1.2.

First, let us prove some lemmas that will help us use the high-rank condition on the quadratic averages in conjunction with the square independence of the linear forms. The first result states that if the bilinear form $\beta$ has high rank, then the phase function $\omega^{\beta(x,y)}$ is quasirandom.

Lemma 6.1. Let $\beta$ be a bilinear form of rank at least $r$ on a subspace $V$ of $\mathbb{F}_p^n$, and let $g$ and $h$ be two functions with $\|g\|_\infty$ and $\|h\|_\infty$ at most 1. Then

\[ |\mathbb{E}_{x,y} \omega^{\beta(x,y)} g(x) h(y)| \leq p^{-r/2}. \]

Proof. This lemma can be proved either directly (as we shall do) or indirectly, by first estimating the rectangle norm of the function and applying standard results in the theory of quasirandomness. Either way, the proof is a standard application of the Cauchy-Schwarz inequality.

\[ |\mathbb{E}_{x,y} \omega^{\beta(x,y)} g(x) h(y)|^2 \leq \mathbb{E}_x |\mathbb{E}_y \omega^{\beta(x,y)} g(x) h(y)|^2 \leq \mathbb{E}_x |\mathbb{E}_y \omega^{\beta(x,y)} h(y)|^2. \]

The latter expression can be expanded as

\[ \mathbb{E}_{y,y'} h(y) \overline{h(y')} \mathbb{E}_x \omega^{\beta(x, y - y')} \leq \mathbb{E}_{y,y'} |\mathbb{E}_x \omega^{\beta(x, y - y')}|. \]

Now $\beta(x, y - y')$ depends linearly on $x$, so $\mathbb{E}_x \omega^{\beta(x, y - y')}$ is zero unless $\omega^{\beta(x, y - y')}$ is constant in $x$. That is, $\mathbb{E}_x \omega^{\beta(x, y - y')}$ is zero unless $y - y'$ belongs to the kernel of $\beta$, in which case it has modulus 1. Since $\beta$ has rank at least $r$, the probability, for each $y$, that $y - y'$ belongs to the kernel is at most $p^{-r}$. Therefore,

\[ \mathbb{E}_{y,y'} |\mathbb{E}_x \omega^{\beta(x, y - y')}| \leq p^{-r}. \]

The result follows on taking square roots. □
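As a sanity check, the bound of Lemma 6.1 can be verified numerically for a small full-rank form. The sketch below is an illustration under our own assumptions: $V = \mathbb{F}_3^2$, $\beta(x, y) = x^T M y$ with $M$ the identity matrix (so $r = 2$), $g$ and $h$ are random unimodular functions stored as dictionaries, and the function name `bilinear_average` is hypothetical.

```python
import cmath
import itertools
import random


def bilinear_average(M, g, h, p):
    """Return E_{x,y} omega^{x^T M y} g(x) h(y) over x, y in F_p^n."""
    n = len(M)
    omega = cmath.exp(2j * cmath.pi / p)
    points = list(itertools.product(range(p), repeat=n))
    total = 0
    for x in points:
        for y in points:
            beta = sum(x[a] * M[a][b] * y[b] for a in range(n) for b in range(n)) % p
            total += omega ** beta * g[x] * h[y]
    return total / len(points) ** 2


p = 3
M = [[1, 0], [0, 1]]  # the dot product on F_3^2, of rank r = 2
points = list(itertools.product(range(p), repeat=2))

random.seed(0)
g = {x: cmath.exp(2j * cmath.pi * random.random()) for x in points}
h = {y: cmath.exp(2j * cmath.pi * random.random()) for y in points}

value = abs(bilinear_average(M, g, h, p))
bound = p ** (-2 / 2)  # p^{-r/2} with r = 2
print(value <= bound)

# With g = h = 1 the average is exactly the density of the kernel, p^{-r}.
ones = {x: 1 for x in points}
trivial = bilinear_average(M, ones, ones, p)
print(abs(trivial))
```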

In order to show that the average over a product of quadratic phases is small, we actually 
only need one of the quadratic phases involved to have high rank. 

Lemma 6.2. Let $d$ be a positive integer and for every pair $(u, v) \in [d]^2$ let $\beta_{uv}$ be a bilinear form on $\mathbb{F}_p^n$ taking variables $x_u$ and $x_v$. For $u \in [d]$, let $\phi_u$ be a linear functional on $(\mathbb{F}_p^n)^d$ in the variable $x_u$. Suppose that the rank of $\beta_{uv}$ is at least $r$ for at least one pair $(u, v)$. Then

\[ \Big| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \omega^{\sum_{u,v} \beta_{uv}(x_u, x_v) + \sum_u \phi_u(x_u)} \Big| \leq p^{-r/2}. \]

Proof. Let us assume first that $\beta_{uu}$ has rank at least $r$ for some $u$. If we fix the values of $x_v$ for every $v \neq u$, then the sum in the exponent takes the form $\beta_{uu}(x_u, x_u) + \gamma(x_u)$ for some linear functional $\gamma$. Therefore, by Lemma 3.3 the expectation over $x_u$ has modulus at most $p^{-r/2}$. Since this is true for every choice of the other $x_v$, the whole expectation has modulus at most $p^{-r/2}$.

Now let us assume that $\beta_{uv}$ has rank at least $r$ for some pair $(u, v)$ with $u \neq v$. This time, let us fix all the variables apart from $x_u$ and $x_v$. Now the sum in the exponent takes the form

\[ 2\beta_{uv}(x_u, x_v) + \phi(x_u) + \psi(x_v), \]

so by Lemma 6.1 the expectation over $x_u$ and $x_v$ is at most $p^{-r/2}$. Again, since this is true for every possible choice of the other variables, the whole expectation is at most $p^{-r/2}$. □

Next we shall show that if we have a set of bilinear forms of high rank, then at least one of the linear combinations arising from a square-independent system $L_1, L_2, \dots, L_m$ must have fairly high rank.

Lemma 6.3. Let $V$ be a subspace of $\mathbb{F}_p^n$ and let $\beta_1, \dots, \beta_m$ be bilinear forms on $V$ with rank at least $r$. Let $B$ be an invertible $m \times m$ matrix with entries $b_{ij} \in \mathbb{F}_p$. Then at least one of the bilinear forms $\eta_j = \sum_{i=1}^m b_{ij} \beta_i$ has rank at least $r/m$.

Proof. It follows from the assumption that $B$ is invertible that $\beta = B^{-1} \eta$. But the rank of a linear combination of the $\eta_i$ is at most the sum of the ranks of the $\eta_i$. □

Corollary 6.4. Suppose that $L_i(x) = \sum_{u=1}^d c_{iu} x_u$, $i = 1, 2, \dots, m$, is a square-independent system. Suppose that each of the (not necessarily distinct) bilinear forms $\beta_i$, $i = 1, 2, \dots, m$, has rank at least $r$. Then at least one of the bilinear forms $\beta_{uv} = \sum_{i=1}^m c_{iu} c_{iv} \beta_i$ has rank at least $r/m$.

Proof. For each $i = 1, 2, \dots, m$, let $M_i$ be the matrix $(c_{iu} c_{iv})_{u,v}$. Square independence implies that the matrices $M_i$ are linearly independent over $\mathbb{F}_p$. This implies that the rank of the $d^2 \times m$ matrix whose $((u, v), i)$ entry is $c_{iu} c_{iv}$ is $m$. The columns of this matrix are the $(d \times d)$ matrices $M_1, \dots, M_m$, regarded as vectors of length $d^2$. The rows are the vectors $C_{uv} = (c_{1u} c_{1v}, c_{2u} c_{2v}, \dots, c_{mu} c_{mv})$. Since row rank equals column rank, we can find $m$ linearly independent vectors $C_{uv}$. Now apply Lemma 6.3 to the bilinear forms $\beta_{uv} = \sum_{i=1}^m (C_{uv})_i \beta_i$ to obtain the result. □
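The rank computation in this proof doubles as a practical test for square independence. The following Python sketch illustrates it under our own assumptions: $p$ is an odd prime, each linear form is given by its coefficient vector $(c_{i1}, \dots, c_{id})$, and the function names are hypothetical. It builds the matrix whose rows are the flattened matrices $M_i$ (the transpose of the $d^2 \times m$ matrix above, which has the same rank) and checks whether its rank over $\mathbb{F}_p$ equals $m$.

```python
def rank_mod_p(matrix, p):
    """Rank over F_p (p prime) by Gaussian elimination."""
    rows = [[a % p for a in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        rows[rank] = [(a * inv) % p for a in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def is_square_independent(forms, p):
    """Check whether the matrices M_i = (c_iu * c_iv) are independent over F_p."""
    rows = [[(c[u] * c[v]) % p for u in range(len(c)) for v in range(len(c))]
            for c in forms]
    return rank_mod_p(rows, p) == len(forms)


# Forms in the variables (x, y): x, x + y, x + 2y, and then also x + 3y.
three_term = [[1, 0], [1, 1], [1, 2]]
four_term = three_term + [[1, 3]]
print(is_square_independent(three_term, 5))
print(is_square_independent(four_term, 5))
```

The second system cannot be square independent, since symmetric $2 \times 2$ matrices form a space of dimension only 3.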

We are now in a position to prove the key ingredient of Theorem 1.2.

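Proposition 6.5. Let $C$, $D$, $R$, $T$ and $\delta$ be positive constants. For each $i = 1, 2, \dots, m$, let $f_i = \sum_{j=1}^{k_i} U_j^{(i)} Q_j^{(i)}$ be a linear combination of quadratic averages on $\mathbb{F}_p^n$, each of rank at least $R$ and complexity at most $D$, such that $\sum_{j=1}^{k_i} \|U_j^{(i)}\|_\infty \leq C$ and $\sum_{j=1}^{k_i} \|U_j^{(i)}\|_{U^2}^* \leq T$. Let $d$ and $m$ be positive integers, and let $L_1, \dots, L_m$ be a square-independent system of $m$ linear forms in $d$ variables. Then

\[ \Big| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \Big| \leq \big( C^m p^{D + \delta^{-4} T^4 - R/2m} + \delta \big) \prod_{i=1}^m k_i. \]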

Proof. Since each $f_i$ is a sum of quadratic averages, the expectation in question can be split up into a sum of terms of the form

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m \big( U_{j_i}^{(i)} Q_{j_i}^{(i)} \big)(L_i(x)). \]

Let us obtain an upper bound for the size of one of these terms. For ease of notation, we shall take the sequence $(j_1, \dots, j_m) \in [k_1] \times [k_2] \times \dots \times [k_m]$ to be the sequence $(1, 2, \dots, m)$. That is, we let $Q_1, \dots, Q_m$ be an arbitrary sequence of quadratic averages of rank at least $R$ and complexity at most $D$, and we let the $U_i$ be an arbitrary sequence of bounded anti-uniform functions. We shall obtain an upper bound for the modulus of $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_i (U_i Q_i)(L_i(x))$.

By Lemma 5.2 we can find, for each $i = 1, 2, \dots, m$, a subspace $W_i$ of codimension at most $\delta^{-4} T^4$ such that $\|U_i - U_i * \mu_{W_i}\|_2 \leq \delta$. In other words, each $U_i$ can be approximated by a function that is constant on translates of $W_i$ and still bounded. Set $W = W_1 \cap \dots \cap W_m$, which is a subspace of codimension at most $m\delta^{-4} T^4$. Then $\|U_i - U_i * \mu_W\|_2 \leq \delta$ for all $i = 1, 2, \dots, m$. We shall replace $U_i$ by $U_i * \mu_W$ in the above average, incurring an error of at most $\delta$, and from now on focus on $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_i ((U_i * \mu_W) Q_i)(L_i(x))$.

For each $i$, let us write $(V_i, q_i)$ for the base of the quadratic average $Q_i$, and let $V = W \cap V_1 \cap \dots \cap V_m$. Then we can, if we wish, regard each $Q_i$ as a quadratic average with base $(V, q_i)$. Since each $V_i$ has codimension at most $D$ in $\mathbb{F}_p^n$ and $W$ had codimension at most $m\delta^{-4} T^4$, the codimension of $V$ is at most $m(D + \delta^{-4} T^4)$, from which it follows that the rank of $Q_i$, when considered in this new way, is at least $R - 2m(D + \delta^{-4} T^4)$ by Lemma 5.6.

We now split the expectation $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_i ((U_i * \mu_W) Q_i)(L_i(x))$ even further, according to the particular set of translates of $V$ that the components $x_1, \dots, x_d$ of $x$ belong to. Let $V_1, \dots, V_d$ be arbitrary translates of $V$ and let us obtain a bound for the size of

\[ \mathbb{E}_{x_1 \in V_1} \mathbb{E}_{x_2 \in V_2} \dots \mathbb{E}_{x_d \in V_d} \prod_{i=1}^m ((U_i * \mu_W) Q_i)(L_i(x)). \]

Now if each $x_j$ is confined to $V_j$, then $L_i(x)$ is confined to some particular translate $y + V$ of $V$. On this translate, $U_i * \mu_W$ is constant, say equal to $\lambda_y$, and $Q_i(x)$ is given by a formula of the form $\omega^{q_i(x-y) + \phi_y(x-y)}$, where $\phi_y$ is linear. It follows that the expectation we are trying to estimate is equal to a quantity of the form $\mathbb{E}_{x \in V^d} \omega^{\sum_i q_i(L_i(x)) + \phi_i(L_i(x))}$, where each $q_i$ has rank at least $R - 2m(D + \delta^{-4} T^4)$, and we temporarily disregard the product of the coefficients $\prod_{i=1}^m \lambda_i$.

Now for each $i = 1, 2, \dots, m$, let the linear form $L_i(x)$ be given by the formula $\sum_{u=1}^d c_{iu} x_u$. Then, writing $\beta_i$ for the bilinear form associated with $q_i$, we have

\[ \sum_{i=1}^m q_i(L_i(x)) = \sum_{u,v=1}^d \sum_{i=1}^m c_{iu} c_{iv} \beta_i(x_u, x_v). \]

For each $u$ and $v$, let us write $\beta_{uv}$ for the bilinear form $\sum_{i=1}^m c_{iu} c_{iv} \beta_i$. Corollary 6.4 implies that at least one of the bilinear forms $\beta_{uv}$ has rank at least $R/m - 2(D + \delta^{-4} T^4)$.

Write $\sum_{i=1}^m \phi_i(L_i(x)) = \sum_{i=1}^m \phi_i\big(\sum_{u=1}^d c_{iu} x_u\big) = \sum_{i=1}^m \sum_{u=1}^d c_{iu} \phi_i(x_u) = \sum_{u=1}^d \psi_u(x_u)$, where $\psi_u = \sum_{i=1}^m c_{iu} \phi_i$, and denote this sum by $\phi(x)$. Then

\[ \sum_{i=1}^m \big( q_i(L_i(x)) + \phi_i(L_i(x)) \big) = \sum_{u,v} \beta_{uv}(x_u, x_v) + \phi(x). \]

Applying Lemma 6.2 we may deduce that

\[ \big| \mathbb{E}_{x \in V^d} \omega^{\sum_{u,v} \beta_{uv}(x_u, x_v) + \phi(x)} \big| \leq p^{(D + \delta^{-4} T^4) - R/2m}. \]

Since this estimate did not depend on our choice of translates of $V$, it follows that

\[ \Big| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_i ((U_i * \mu_W) Q_i)(L_i(x)) \Big| \leq C^m p^{D + \delta^{-4} T^4 - R/2m}, \]

where the sum over the product of the constant coefficients $\lambda_i$, which we had temporarily neglected, contributes the additional factor of $C^m$. Since this was true for an arbitrary choice of quadratic averages of rank at least $R$, the statement of the proposition follows. □

Before we are able to prove Theorem 1.2 we need one more standard result that will allow us to neglect the quadratically uniform part of the decomposition. The following statement is implicit in Green and Tao [GrT06], and was also a major ingredient in [GW09a]. The proof is a repeated application of the Cauchy-Schwarz inequality together with a suitable reparametrization of the linear system under consideration.

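Theorem 6.6. Let $f_1, \dots, f_m$ be functions on $\mathbb{F}_p^n$ and let $L_1, L_2, \dots, L_m$ be a linear system of Cauchy-Schwarz complexity $k$ consisting of $m$ forms in $d$ variables. Then

\[ \Big| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \Big| \leq \min_i \|f_i\|_{U^{k+1}} \prod_{j \neq i} \|f_j\|_\infty. \]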



Let us now put all the technical results from the preceding two sections together to give an improved bound for Theorem 1.1 and thus prove Theorem 1.2.

Proof of Theorem 1.2. Let $\varepsilon > 0$, and let $c > 0$ be chosen in terms of $\varepsilon$ later. Given $f : \mathbb{F}_p^n \to [-1, 1]$ with $\|f\|_{U^2} \leq c$ we first apply Theorem 5.7 with $\delta_1 = \varepsilon/(18m)$ to obtain a decomposition

\[ f = f_1 + g_1 + h_1, \]

where $f_1 = \sum_j U_j^{(1)} Q_j^{(1)}$ with $\sum_j \|U_j^{(1)}\|_\infty \leq M_1$, $\|g_1\|_1 \leq 7\delta_1$ and $\|h_1\|_{U^3} \leq 2\delta_1$. We have carefully ensured that each quadratic average $Q_j^{(1)}$ has rank at least $R_0$ for some $R_0$ to be chosen later, and $M_1$ is a function of $\delta_1$ only, which can be taken to equal $(2\delta_1^{-2})^{C_p}$. Recall that we want to show that

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)) \]

is bounded in absolute value by $\varepsilon$ for sufficiently uniform $f$. We begin by replacing the first $f$ in the product by $g_1 + h_1$. The product involving $g_1$ yields an error term of $7\delta_1$ since all the remaining factors have $L_\infty$ norm bounded by 1, while the product involving $h_1$ yields an error of $2\delta_1$ by Theorem 6.6 above with $k = 2$. Our choice of $\delta_1$ implies that the sum of these two errors is at most $\varepsilon/(2m)$.

Now we apply Theorem 5.7 again, this time with $\delta_2 = \varepsilon/(18mM_1)$, to obtain a decomposition

\[ f = f_2 + g_2 + h_2, \]

where $f_2 = \sum_j U_j^{(2)} Q_j^{(2)}$ with $\sum_j \|U_j^{(2)}\|_\infty \leq M_2$, $\|g_2\|_1 \leq 7\delta_2$ and $\|h_2\|_{U^3} \leq 2\delta_2$. When replacing the first instance of $f$ in

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} f_1(L_1(x)) \prod_{i=2}^m f(L_i(x)) \]

with $g_2 + h_2$, the product involving $g_2$ now contributes an error term of at most $7\delta_2 M_1$ (since $\|f_1\|_\infty \leq M_1$). By Theorem 6.6 it follows that the contribution from the product involving $h_2$ is bounded above by $2\delta_2 M_1$. Therefore the total error incurred is at most $9\delta_2 M_1$, which is at most $\varepsilon/(2m)$ by our choice of $\delta_2$.

When we come to apply Theorem 5.7 to the $k$th instance of $f$ in the product, we need to do so with $\delta_k$ satisfying $9\delta_k M_1 \dots M_{k-1} \leq \varepsilon/(2m)$ for $k = 2, \dots, m$. This ensures that up to an error of $\varepsilon/2$, it suffices to consider the product

\[ \prod_{i=1}^m f_i(L_i(x)). \]

Since each $M_i$ is a polynomial in $\delta_i^{-1}$, and since $\delta_1$ was chosen proportional to $\varepsilon$, it is easy to see that $M_m$ will be bounded above by a polynomial in $\varepsilon^{-1}$. In fact, it is not difficult to establish that the bound on $M_m$ will be of the form $c_{m,p} \varepsilon^{-(4C_p)^m}$, where $c_{m,p}$ will be a constant depending on $m$ and $p$ only.

Recall that each $f_i$ was of the form $\sum_{j=1}^{k_i} U_j^{(i)} Q_j^{(i)}$ with $\sum_j \|U_j^{(i)}\|_\infty \leq M_i$, $\sum_j \|U_j^{(i)}\|_{U^2}^* \leq T_i$ and each $Q_j^{(i)}$ had rank at least $R_0$ and complexity at most $d_i$. It is clear from the procedure we have applied that the parameters $d_i$, $k_i$, $M_i$ and $T_i$ are strictly increasing in $i$, and that we can take $d_m = (2/\delta_m)^{C_p}$, $k_m = (2/\delta_m^2)^{2C_p}/\delta_m^2$, $M_m = (2/\delta_m^2)^{C_p}$ and $T_m = 2^{3/2} M_m^4 \delta_m^{-3} p^{3d_m/4}$ by Theorem 5.7.

Proposition 6.5 with $\delta = \varepsilon k_m^{-m}/4$ now implies that

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \]

is bounded in modulus by

\[ k_m^m M_m^m p^{d_m + \delta^{-4}(2^6 M_m^{16} \delta_m^{-12} p^{3d_m}) - R_0/2m} + \varepsilon/4. \]

Let us analyse this expression. As we have already remarked, at each step $\delta_i$ is a polynomial in $\varepsilon$, and thus the quantities $d_m$, $k_m$ and $M_m$ are polynomial in $\varepsilon^{-1}$. It follows that $R_0$ needs to be taken exponential in $\varepsilon^{-1}$ at each step.

More precisely, $\delta_m$ can be chosen of the form $c_{m,p} \varepsilon^{(4C_p)^{m-1}}$ for a constant $c_{m,p}$ only depending on $m$ and $p$, and hence $d_m$ can be assumed to be at most $c'_{m,p} \varepsilon^{-(4C_p)^m}$. It therefore suffices to take $R_0$ of the form $\exp(c''_{m,p} \varepsilon^{-(4C_p)^m})$.

However, in order to be able to choose $R_0$ as the minimum rank in each decomposition of $f$, we needed the uniformity parameter $c$ to satisfy $c \leq p^{-p^{16 d_m} R_0}$, which is a function of the form $\exp(-\exp(c'''_{m,p} \varepsilon^{-(4C_p)^m}))$. □



7. Remarks 

An obvious question to ask is whether the bounds in Theorem 1.2 can be improved further. We remark that it is possible to obtain a single exponential in Theorem 1.2 if one works under the assumption of the so-called Polynomial Freiman-Ruzsa Conjecture. This conjecture asserts that a subset $A$ of doubling $K$ can be covered by at most $C_1(K)$ translates of a subspace of size at most $C_2(K)|A|$, where both $C_1(K)$ and $C_2(K)$ are polynomial in $K$ (see for example [Gr05]). It has recently been shown to be equivalent to polynomial bounds in Theorem 2.1 [GrT09b]. Applying a local version of this conjecture, we find that $\delta_i$ and $M_i$ in the proof above still grow polynomially in $\varepsilon^{-1}$, while the dimension $d$ remains logarithmic in $\varepsilon^{-1}$, reducing the final bound to a single exponential. In the other direction, we do not know of a lower bound that is better than a power.